Introduction: Lymphoma represents a diverse group of cancers with heterogeneous presentations, making treatment decisions complex. While randomized trials are foundational, they often exclude the unique patient situations detailed in case reports. A vast amount of evidence from case reports remains underutilized because manual classification and data extraction are labor-intensive. We aimed to overcome this challenge by developing and validating a large-scale, automated analysis of published lymphoma case reports using a generative artificial intelligence (AI) extraction pipeline.

Methods: A comprehensive search of the PubMed database was performed using relevant MeSH terms to identify all potential lymphoma case reports. To extract key clinical information, a structured questionnaire with 51 items covering patient demographics, lymphoma type, diagnostic procedures, disease location, treatments, and outcomes was created. The large language model (LLM) Rombos-LLM-V2.6-Qwen-14b, deployed with the data-element-extractor Python package, was used to answer these questions systematically from each publication's title and abstract. The LLM's performance was benchmarked against a manually labeled ground-truth dataset of 298 reports before a quantitative analysis of all extracted data was conducted.
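The extraction and benchmarking steps described above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the study's pipeline: the `Question` class, `build_prompt`, `extract`, and the toy `keyword_model` are hypothetical names introduced here, they do not reflect the API of the data-element-extractor package, and the macro-averaged F1 shown is one common definition that may differ from the averaging used in the study.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Sequence


@dataclass(frozen=True)
class Question:
    """One questionnaire item (hypothetical structure for illustration)."""
    key: str                # short identifier, e.g. "sex"
    text: str               # question put to the model
    choices: Sequence[str]  # allowed answers; "unknown" is the fallback


def build_prompt(q: Question, title: str, abstract: str) -> str:
    """Combine title, abstract, and one question into a single prompt."""
    options = ", ".join(q.choices)
    return (
        f"Title: {title}\nAbstract: {abstract}\n\n"
        f"Question: {q.text}\nAnswer with exactly one of: {options}."
    )


def extract(questions: Sequence[Question], title: str, abstract: str,
            ask_llm: Callable[[str], str]) -> Dict[str, str]:
    """Ask every question; answers outside the allowed set become 'unknown'."""
    answers = {}
    for q in questions:
        raw = ask_llm(build_prompt(q, title, abstract)).strip().lower()
        answers[q.key] = raw if raw in q.choices else "unknown"
    return answers


def accuracy(pred: Sequence[str], truth: Sequence[str]) -> float:
    """Fraction of predictions that match the manual labels."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and equally long")
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def macro_f1(pred: Sequence[str], truth: Sequence[str]) -> float:
    """Unweighted mean of per-label F1 scores (an assumed averaging scheme)."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("pred and truth must be non-empty and equally long")
    scores = []
    for label in sorted(set(pred) | set(truth)):
        tp = sum(p == label and t == label for p, t in zip(pred, truth))
        fp = sum(p == label and t != label for p, t in zip(pred, truth))
        fn = sum(p != label and t == label for p, t in zip(pred, truth))
        scores.append(2 * tp / (2 * tp + fp + fn))
    return sum(scores) / len(scores)


def keyword_model(prompt: str) -> str:
    """Toy stand-in for an LLM call; reads only the title and abstract part."""
    text = prompt.split("Question:")[0].lower()
    if re.search(r"\b(woman|female)\b", text):
        return "female"
    if re.search(r"\b(man|male)\b", text):
        return "male"
    return "unknown"
```

In practice `ask_llm` would wrap a call to the deployed model, and `accuracy` and `macro_f1` would be computed per question over the manually labeled reports before being aggregated.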

Results: The search yielded a total of 10,681 publications. On the validation dataset, the model achieved 96.1% overall accuracy and an F1-score of 80.1%, with F1 > 90% for 13 questions. The LLM analysis identified 3,347 of the 10,681 publications as case reports on individual patients with lymphoma. Of these patients, sex was identified as male in 40.8% (n=1,364) and female in 30.4% (n=1,016). Patient age was identified in 2,898 cases (86.6%), with a median of 52 years (range 0-100) and 17.2% of patients being younger than 18 years. Hodgkin lymphoma was identified in 224 cases (11.1%). A PET-CT scan was reported in 117 patients (3.5%) and a bone marrow biopsy in 113 patients (6.7%). The organs most frequently identified as involved by lymphoma were skin (n=272, 8.1%), gastrointestinal tract (n=264, 7.9%), central nervous system (n=222, 6.6%), eye (n=100, 4.1%), liver (n=77, 2.3%), oral cavity (n=73, 2.2%), and spleen (n=66, 2.0%). Therapies mentioned included chemotherapy in 668 cases (20.0%), radiotherapy in 297 cases (8.9%), and surgery in 119 cases (3.6%). The patient's death was mentioned in 541 cases (16.2%).

Conclusions: To our knowledge, this is the first study to systematically analyze all lymphoma case reports indexed in PubMed using generative AI. Our findings demonstrate that LLMs can accurately and efficiently extract diverse clinical data from scientific literature at large scale. This approach provides a powerful method for creating living case report registries, which can supply detailed clinical evidence to enhance our understanding of the complex and heterogeneous landscape of lymphoma.
